STA 4107/5107 Statistical Learning: Principal Components and Partial Least Squares Regression
Not recorded
Abstract
Principal components analysis is traditionally presented as an interpretive multivariate technique, in which the loadings are chosen to explain as much of the variance in the variables as possible. Here, however, we consider it mainly as a statistical learning tool: the derived components are used as predictors in a least squares regression to predict unobserved response variables. Principal components analysis aims to explain as much of the variation in the data as possible by finding linear combinations of the variables that are uncorrelated with each other and that point in the directions of greatest variation. Each principal component (PC) is a linear combination of all the variables. The first PC explains the most variation, the second PC the second most, and so on. There are as many principal components as there are variables, but we usually retain only the first few for both exploratory and regression analysis. Partial least squares is a method of dimension reduction, similar to principal components, that seeks the factors most relevant for both prediction and interpretation; it is derived from Herman Wold's development of iterative fitting of bilinear models (Wold, 1981, 1983). Partial least squares regression (PLSR) improves upon principal components regression by actively using the response variables during the bilinear decomposition of the predictors. Principal components focuses on the variance in the predictors, while partial least squares focuses on the covariance between the response and the predictors. By balancing the information in both the predictors and the response, PLS reduces the impact of large but irrelevant variation in the predictors. Prediction error is estimated using cross-validation.
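The contrast between the two methods can be illustrated with a minimal sketch, which is not the course's own code. It assumes NumPy, a single response, and simulated data in which one predictor has large variance but no relation to the response. The helper names `fit_pcr`, `fit_pls1`, and `cv_mse` are hypothetical, and the PLS fit is a textbook NIPALS-style PLS1, used here only for illustration.

```python
import numpy as np

def fit_pcr(X, y, k):
    """Principal components regression: regress y on the first k PC scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                       # p x k loadings
    T = Xc @ V                         # n x k component scores
    gamma, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    return x_mean, y_mean, V @ gamma   # coefficients on the original variables

def fit_pls1(X, y, k):
    """PLS regression for one response (NIPALS-style PLS1) with k components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(k):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # component scores
        tt = t @ t
        p = Xk.T @ t / tt              # predictor loadings
        c = yk @ t / tt                # response loading
        Xk = Xk - np.outer(t, p)       # deflate predictors
        yk = yk - c * t                # deflate response
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, beta

def cv_mse(fit, X, y, k, n_folds=5, seed=0):
    """K-fold cross-validated mean squared prediction error."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        x_mean, y_mean, beta = fit(X[train], y[train], k)
        pred = (X[test] - x_mean) @ beta + y_mean
        sq_err += np.sum((y[test] - pred) ** 2)
    return sq_err / len(y)

# Simulated data (an assumption made for illustration only).
rng = np.random.default_rng(0)
n = 500
x_big = rng.normal(scale=3.0, size=n)     # large variance, unrelated to y
x_small = rng.normal(scale=1.0, size=n)   # small variance, drives y
X = np.column_stack([x_big, x_small])
y = x_small + rng.normal(scale=0.1, size=n)

pcr_mse = cv_mse(fit_pcr, X, y, k=1)
pls_mse = cv_mse(fit_pls1, X, y, k=1)
print(f"1-component PCR  CV MSE: {pcr_mse:.3f}")
print(f"1-component PLSR CV MSE: {pls_mse:.3f}")
```

In this setup the first principal component follows the high-variance but irrelevant predictor, so one-component PCR predicts poorly, while the PLS weight vector is pulled toward the predictor that covaries with the response. With as many components as predictors, both fits reduce to ordinary least squares.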
Similar resources
Determination of the Colorants in Various Samples by Chemometric Methods Using Statistical Chemistry
Partial least squares and principal component regression methods were applied to various mixtures of Allura Red and Brilliant Blue to determine the concentrations. Colorants, at the same time, were analyzed with UV-spectrophotometry in chemical separation. The obtained experimental data have been evaluated by chemometric methods as Partial Least Squares (PLS) and Principle Component Regressi...
Local Dimensionality Reduction
If globally high dimensional data has locally only low dimensional distributions, it is advantageous to perform a local dimensionality reduction before further processing the data. In this paper we examine several techniques for local dimensionality reduction in the context of locally weighted linear regression. As possible candidates, we derive local versions of factor analysis regression, pri...
Kernel Partial Least Squares is Universally Consistent
We prove the statistical consistency of kernel Partial Least Squares Regression applied to a bounded regression learning problem on a reproducing kernel Hilbert space. Partial Least Squares stands out of well-known classical approaches as e.g. Ridge Regression or Principal Components Regression, as it is not defined as the solution of a global cost minimization procedure over a fixed model nor ...
Selection of the Optimal Wavebands for the Variety Discrimination of Chinese Cabbage Seed
This paper presents a method based on chemometrics analysis to select the optimal wavebands for variety discrimination of Chinese cabbage seed by using a Visible/Near-infrared spectroscopy (Vis/NIRS) system. A total of 120 seed samples were investigated using a field spectroradiometer. Chemometrics was used to build the relationship between the absorbance spectra and varieties. Principle compon...
Research on Several Problems in Partial Least Squares Regression Analysis
Purpose: preliminary discussion on model prediction precision in the partial least squares regression analysis method; Method: introduce current development conditions of partial least squares regression analysis, analyze problems of traditional regression analysis method such as multiple linear regression analysis, introduce the mathematic principle and modeling method of the partial least squ...